Applied causal inference methods for sequential mediators

Background Mediation analysis aims at estimating to what extent the effect of an exposure on an outcome is explained by a set of mediators on the causal pathway between the exposure and the outcome. The total effect of the exposure on the outcome can be decomposed into an indirect effect, i.e. the effect explained by the mediators jointly, and a direct effect, i.e. the effect unexplained by the mediators. However finer decompositions are possible in presence of independent or sequential mediators. Methods We review four statistical methods to analyse multiple sequential mediators, the inverse odds ratio weighting approach, the inverse probability weighting approach, the imputation approach and the extended imputation approach. These approaches are compared and implemented using a case-study with the aim to investigate the mediating role of adverse reproductive outcomes and infant respiratory infections in the effect of maternal pregnancy mental health on infant wheezing in the Ninfea birth cohort. Results Using the inverse odds ratio weighting approach, the direct effect of maternal depression or anxiety in pregnancy is equal to a 59% (95% CI: 27%,94%) increased prevalence of infant wheezing and the mediated effect through adverse reproductive outcomes is equal to a 3% (95% CI: -6%,12%) increased prevalence of infant wheezing. When including infant lower respiratory infections in the mediation pathway, the direct effect decreases to 57% (95% CI: 25%,92%) and the indirect effect increases to 5% (95% CI: -5%,15%). The estimates of the effects obtained using the weighting and the imputation approaches are similar. The extended imputation approach suggests that the small joint indirect effect through adverse reproductive outcomes and lower respiratory infections is due entirely to the contribution of infant lower respiratory infections, and not to an increased prevalence of adverse reproductive outcomes. Conclusions The four methods revealed similar results of small mediating role of adverse reproductive outcomes and early respiratory tract infections in the effect of maternal pregnancy mental health on infant wheezing. The choice of the method depends on what is the effect of main interest, the type of the variables involved in the analysis (binary, categorical, count or continuous) and the confidence in specifying the models for the exposure, the mediators and the outcome. Supplementary Information The online version contains supplementary material available at 10.1186/s12874-022-01764-w.


Identifying assumptions for the direct and indirect effects
We use the notation A {B|C} to denote that A is independent of B conditional on C, and P (B|A, C) to denote the conditional probability of B given A and C. When B is continuous, the conditional probability is replaced by the conditional density. The identifying and estimating assumptions for the direct and indirect effects include: • the cross-world independence assumption: Y (am 1 m 2 ) {M 1 (a * ), M 2 (a * )}|C = c).

Selected methods for multiple mediation analysis
We will describe the following four methods: the inverse odds ratio weighting approach (IOR), the inverse probability weighting approach (IPW), the imputation approach and the extended imputation approach. Specifically we will consider that the exposure A, the mediators M 1 and M 2 , and the outcome Y are all binary. However, these methods can be implemented in scenarios with different combinations of continuous, categorical, count and binary variables.
Inverse odds ratio weighting approach The inverse odds ratio weighting approach estimates the counterfactual g{E[Y (a, M 1 (a * ), M 2 (a * , M 1 (a * )))|C = c]} (i.e the scenario under which the population is exposed to A = a but all the mediators take the natural value under the scenario A = a * ) by means of the following equality: (1) where W is the inverse of the conditional odds ratio function relating M 1 and M 2 to A within the levels of C: where P (A = a|M 1 = m 1 , M 2 = m 2 , C = c) is the conditional probability of the exposure given the two mediators and the covariates. Weighting each subject with the inverse odds ratio function (3) relating M 1 and M 2 to A within the levels of C makes A and {M 1 , M 2 } independent. To obtain more stabilised weights, one can multiply each individual's exposure-mediator odds ratio by the predicted odds of the exposure when the mediators are fixed at their reference value obtaining an inverse odds weight instead of inverse odds ratio weight.
In practice (when considering A, M 1 , M 2 , Y binary), to estimate the conditional total effect we model the mean observed outcome for each subject (Y ) conditional on the observed exposure (A) and the covariates (C) using a generalized linear regression model. The conditional total effect is then equal to the exponentiated coefficient for the exposure A if the interactions between A and C are not included in the model. To estimate the conditional pure direct effect we model the mean observed outcome for each subject (Y ) conditional on the observed exposure (A) and the covariates (C) using a weighted generalized linear regression model where weights W are equal to 1 for unexposed subjects and equal to the inverse odds ratio predicted by the logistic regression model of A given M 1 and M 2 and C for the exposed subjects. The conditional pure direct effect is then equal to the inverse of the exponentiated coefficient for the exposure A if the interactions between A and C are not included in the model. Finally to estimate the conditional total indirect effect we calculate the ratio between the estimated total effect and the estimated pure direct effect. Note that all the effects are estimated using the original data with no imputations (Table 1). Confidence intervals can be calculated by bootstrapping. Table 1: Inverse odds ratio weighting approach: example based on two subjects, one exposed (i=1) and the other one unexposed (i=2). A: exposure of the i -subiect, M 1 : first mediator of the i -subiect, M 2 : second mediator of the i -subiect, Y : outcome of the i -subiect.
Inverse probability weighting approach The inverse probability weighting approach estimates the marginal pure direct and total indirect effects. Specifically it estimates the three counterfactuals, g{E[Y (a, M 1 (a), M 2 (a, M 1 (a)))]}, g{E[Y (a * , M 1 (a * ), M 2 (a * , M 1 (a * )))]} and g{E[Y (a, M 1 (a * ), M 2 (a * , M 1 (a * )))]} by means of the following equalities: ))]} can be estimated from the observed data. The third counterfactual g E[Y (a, M 1 (a * ), M 2 (a * , M 1 (a * )))] , which includes both two potential outcomes under A = a and A = a * and cannot be obtained by the observed data, can still be estimated by standardising the mean outcome Y in each stratum defined by the mediators M 1 and M 2 and the confounders C among individuals exposed at the level A = a, to the mediator distribution of individuals exposed at the level A = a * and by weighting by the reciprocal of the conditional probability of the exposure A given the covariates C.
In practice (when considering A, M 1 , M 2 , Y binary), we fit to the observed data i) an outcome model g[E(Y |A = a, M 1 = m 1 , M 2 = m 2 , C = c)] conditional on the observed exposure A, the mediators M 1 and M 2 , and covariates C using a generalized linear regression model and, ii) an exposure model P (A|C = c) conditional on the observed covariates C using a logistic regression model to calculate the corresponding weights. We expand the observed data by repeating each observation in the original data set twice and we consider one additional variable A which is equal to the observed exposure A for the first replication and equal to the opposite of the observed exposure for the second replication (Table 2). When A is equal to the observed A, we estimate g{E[Y (1, M 1 (1), M 2 (1, M 1 (1)))]} and g{E[Y (0, M 1 (0), M 2 (0, M 1 (0)))]} from the outcome model by using the observed data. On the contrary, when A is different from the observed A, we estimate g{E[Y (1, M 1 (0), M 2 (0, M 1 (0)))]} from the outcome model by using the individual's own values of mediators M 1 and M 2 and confounders C in the unexposed subjects (A=0), but using A = 1, the opposite of the observed exposure A. Hence we calculate a weighted average of these predicted values for subjects with A = 0. Similarly we predict g{E[Y (0, M 1 (1), M 2 (1, M 1 (1)))]} from the outcome model by using the individual's own values of mediators M 1 and M 2 and confounders C in the exposed subjects (A=1), but using A = 0, the opposite of the observed exposure A, and we calculate a weighted average of these predicted values for subjects with A = 1. Confidence intervals can be calculated by bootstrapping. Table 2: Inverse probability weighting approach: example based on two subjects, one exposed (i=1) and the other unexposed (i=2). Bold quantities indicate the unobserved counterfactual values for each subject. A: exposure of the i -subiect, M 1 : first mediator of the i -subiect, M 2 : second mediator of the i -subiect, Y : outcome of the i -subiect. i

Imputation approach
The imputation approach estimates both the marginal and conditional natural direct and indirect effects. We introduce it here focusing on the conditional effects. This approach is based on the so-called natural effects models, i.e. structural models for nested counterfactuals that directly parameterise the natural direct and indirect effects. The natural effects models express the nested counterfactual g{E[Y (a , M 1 (a ), M 2 (a , M 1 (a )))|C = c]} in terms of two newly defined '"exposure" variables A and A (defined below) to compare as follows: where H(a , a , c) is a vector depending on A = a , A = a , C = c and θ is a vector of regression parameters to estimate. For example, θH(a , a , c) could be θ 0 + θ 1 a + θ 2 a + θ 3 a a + θ 4 c. A and A are two variables with the same potential levels of A, and their inclusion in the regression model allows to encode two causal pathways: through neither mediator (i.e direct pathways A → Y ), or through at least one of the two mediators (i.e. indirect pathways . Suppose A is binary with two levels 0 and 1, A and A are also binary and have two potential levels 0 and 1. If both A and A are set to 1, the equation (7) is equal to g{E[Y (1, M 1 (1), M 2 (1, M 1 (1)))|C = c]} = θ 0 + θ 1 + θ 2 + θ 3 + θ 4 c, while, if both A and A are set to 0, the equation (7) is equal to g{E[Y (0, M 1 (0), M 2 (0, M 1 (0)))|C = c]} = θ 0 + θ 4 c. Hence the conditional total effect is equal to: To decompose the total effect, it is necessary to consider scenarios in which A is set to a different value than A . The conditional pure direct effect is equal to: and the conditional total indirect effect is equal to: Similarly to IPW approach, the nested counterfactual g{E[Y (a , M 1 (a ), M 2 (a , M 1 (a )))|C = c]} can be estimated from the observed data when a and a equal the observed exposure A (a corresponds to a and a to a * in the IPW) . When a is equal to the observed exposure A, while a differs from a then g{E[Y (a , M 1 (a ), M 2 (a , M 1 (a )))|C = c]} can still be estimated according to the following equality: It consists of standardising the mean outcome Y in each stratum defined by the mediators M 1 , M 2 and the confounders C among individuals exposed at the level A = a , to the mediator distribution of individuals exposed at the level A = a . In practice (when considering A, M 1 , M 2 , Y binary), we fit to the observed data an imputation model g[E(Y |A = a, M 1 = m 1 , M 2 = m 2 , C = C)] to impute the outcome conditional on A, M 1 , M 2 and C using a generalized linear regression model. The imputation model is used to complete an expansion of the data, in which (i) each observation in the original data set is repeated twice ii) two variables A and A are added, and iii) A is equal to the observed exposure A and A is equal to the observed exposure A for the first replication and equal to the opposite of the observed exposure for the second replication (Table 3). Only when A and A are equal to the observed exposure A the counterfactual outcome g{E[Y (a , M 1 (a ), M 2 (a , M 1 (a )))|C = c]} can be estimated from observed data, otherwise g{E[Y (a , M 1 (a ), M 2 (a , M 1 (a )))|C = c]} can be imputed using the fitted values g{E[Y (a , M 1 (a ), M 2 (a , M 1 (a )))|C = c]} obtained by the imputation model for the outcome with the exposure set to a , the mediators M 1 and M 2 and the baseline covariates C set to their observed values. The imputed outcome is no longer binary, but is substituted by conditional mean imputations. Finally a natural effects model (7) has to be fitted to the imputed data and the conditional effects can be calculated. The estimation of the marginal effects can be performed by weighting the marginal version of the natural effects model g{E[Y (a , M 1 (a ), M 2 (a , M 1 (a )))]} = θH(a , a ) by the reciprocal of the conditional probability of the exposure A given the covariates C estimated using a logistic regression. Confidence intervals can be calculated by bootstrapping.
Table 3: Imputation approach: example based on two subjects, one exposed (i=1) and the other unexposed (i=2). Bold quantities indicate the unobserved counterfactual values for each subject. A: exposure of the i -subiect, M 1 : first mediator of the i -subiect,

Extended imputation approach
The extended imputation approach estimates both the marginal and conditional direct and indirect effects by further decomposing the indirect effect into the effect mediated through M 1 and the effect mediated through M 2 alone. Considering conditional effects, the nested counterfactual g{E[Y (a , M 1 (a ), M 2 (a , M 1 (a )))|C = c]} is now defined in terms of three newly defined "exposure" variables A , A and A (defined below) as follows: where H(a , a , a , c) is a known vector depending on a , a , a , c, and θ is a vector of unknown regression parameters. For example, H(a , a , a , c) could be θ 0 + θ 1 a + θ 2 a + θ 3 a + θ 4 a a + θ 5 a a + θ 6 a a + θ 7 a a a + θ 8 c.
A , A and A are three variables with the same potential levels of A (if A is binary with two levels 0 and 1, then A , A and A have also two hypothetical levels 0 and 1), and their inclusion in the regression model allows to encode the three causal pathways of interest, through neither of the mediators (i.e. the direct pathway A → Y ), through M 1 or M 1 and then M 2 (i.e. the indirect pathway through M 1 : Suppose A is binary, the conditional total effect is equal to: the conditional pure direct effect is equal to: the conditional total indirect effect through the mediators jointly is equal to: the conditional total indirect effect through M 1 is equal to: (16) and the partial total indirect effect through M 2 is equal to: The nested counterfactual g{E[Y (a , M 1 (a ), M 2 (a , M 1 (a )))|C = c]} can be estimated from the observed data when a , a and a equal the observed exposure A. When a , a and a differ one from others, the nested counterfactual can be estimated according to the following equality: which is equal to In practice, we fit to the observed data a model for the probability of either M 1 conditional on A and C or M 2 conditional on M 1 , A and C (according to the researchers' preference), and we fit to the observed data an imputation model for the outcome g[E(Y |A = a, M 1 = m 1 , M 2 = m 2 , C = C)] conditional on A, M 1 , M 2 and C using a generalized linear regression model. The imputation model is used to complete the expansion of the data, in which each observation in the original data set is repeated four times and three variables A , A and A are added to the original exposure variable A. If we are interested in estimating the expression (19), A is equal to the observed exposure level for the first two replications and equal to the opposite of the observed exposure for the third and fourth replications, A is equal to the observed exposure level for all four replications, and A is equal to the observed exposure level for the first and third replications and equal to the opposite of the observed exposure for the second and fourth replication (Table 5) (31)): example based on two subjects, one exposed (1) and the other unexposed (2). Bold quantities indicate the unobserved counterfactual values for each subject. A: exposure of the i -subiect, M 1 : first mediator of the i -subiect, M 2 : second mediator of the i -subiect, Y : outcome of the i -subiect.
i  (32)): example based on two subjects, one exposed (1) and the other unexposed (2). Bold quantities indicate the unobserved counterfactual values for each subject. A: exposure of the i -subiect, M 1 : first mediator of the i -subiect, M 2 : second mediator of the i -subiect, Y : outcome of the i -subiect. i

Summary of the fitted regression models to implement each of the four approaches to sequential mediation analysis
Inverse odds ratio weighting approach. To estimate the conditional total effect we fitted a Poisson regression model to the probability of the occurrence of wheezing between 6 and 18 months of infant life (Y) conditional on depression or anxiety in pregnancy (A) and maternal age, education, residence and body mass index at the beginning of pregnancy, parity and child's sex (C). We assumed no interactions between A and C. To estimate the weights (9), we fitted a logistic regression model of depression or anxiety in pregnancy (A) conditional on adverse reproductive outcomes (M 1 ), occurrence of lower respiratory infection (M 2 ) and maternal age, education, residence and body mass index at the beginning of pregnancy, parity and child's sex (C). We included a term for the interaction betweenM 1 and M 2 . Percentiles confidence intervals were calculated by bootstrap with 1000 repetitions.
Inverse probability weighting approach. To estimate the probability of occurrence of wheezing between 6 and 18 months of infant life (Y), we fitted a Poisson regression model conditional on depression or anxiety in pregnancy (A), adverse reproductive outcomes (M 1 ), occurrence of lower respiratory infections (M 2 ) and maternal age, education, residence and body mass index at the beginning of pregnancy, parity and child's sex (C). We included terms for the interactions between M 1 and M 2 . To estimate the weights in the same expressions, we fitted a logistic regression model on depression or anxiety in pregnancy (A) conditional on maternal age, education, residence and body mass index at the beginning of the pregnancy, parity and child's sex (C). We assumed no interactions between A and C. Percentiles confidence intervals were calculated by bootstrap with 1000 repetitions.
Imputation-based approach. To estimate the imputed nested counterfactuals, we fitted a Poisson regression model to the probability of occurrence of wheezing between 6 and 18 months of infant life (Y) conditional on depression or anxiety in pregnancy (A), adverse reproductive outcomes (M 1 ), occurrence of lower respiratory infections (M 2 ) and maternal age, education, residence and body mass index at the beginning of pregnancy, parity and child's sex (C). We included terms for the interactions between M 1 and M 2 . To estimate the conditional total, direct and indirect effects we fitted a Poisson regression model to the imputed nested counterfactuals conditional on the newly created variables for maternal depression or anxiety in pregnancy (A and A ) and maternal age, education, residence and body mass index at the beginning of the pregnancy, parity and child's sex (C). Similarly, to estimate the marginal total, direct and indirect effects we first calculated the weights P (A|C) using a logistic regression model on depression or anxiety in pregnancy (A) conditional on maternal age, education, residence and body mass index at the beginning of the pregnancy, parity and child's sex (C), and then fitted a Poisson regression model conditional on the newly created variables for maternal depression or anxiety in pregnancy (A and A ) and weighted by the inverse of the estimated probability of A conditional on C. We assumed no interactions between A and C. Percentiles confidence intervals were calculated by bootstrap with 1000 repetitions.
Extended imputation-based approach. To estimate the imputed nested counterfactuals, we fitted a Poisson regression model to the probability of occurrence of wheezing between 6 and 18 months of infant life (Y) conditional on depression or anxiety in pregnancy (A), adverse reproductive outcomes (M 1 ), occurrence of lower respiratory infections (M 2 ) and maternal age, education, residence and body mass index at the beginning of the pregnancy, parity and child's sex (C). We included terms for the interactions between M 1 and M 2 . To estimate the weights on M 1 , we fitted a Poisson regression model to the probability of the adverse reproductive outcomes (M 1 ) conditional on the newly created variables for maternal depression or anxiety in pregnancy (A and A ) and maternal age, education, residence and body mass index at the beginning of the pregnancy, parity and child's sex (C). To estimate the weights on M 2 , we fitted a Poisson regression model to the probability of the occurrence of lower respiratory infection (M 2 ) conditional on adverse reproductive outcomes (M 1 ) and the newly created variables for maternal depression or anxiety in pregnancy (A and A ) and maternal age, education, residence and body mass index at the beginning of the pregnancy, and parity (C). To estimate the conditional total, joint direct and joint and partial indirect effects, we fitted a Poisson regression model to the imputed nested counterfactuals conditional on the newly created variables for maternal depression or anxiety in pregnancy (A , A and A ) and maternal age, education, residence and body mass index at the beginning of the pregnancy, parity and child's sex (C), weighted by the weights described above. The marginal effects were calculated by further weighting by the inverse of the probability of A conditional on C estimated using a logistic regression. Percentiles confidence intervals were calculated by bootstrap with 1000 repetitions.